Segmentation of Heteropolymer Sequences Specifying Subsequences with Different Composition and Statistical Properties
Authors
Abstract
Leonid V. Gusev, Valentina V. Vasilevskaya,* Vsevolod Ju. Makeev, Pavel G. Khalatur, Alexei R. Khokhlov*
Nesmeyanov Institute of Organoelement Compounds, Russian Academy of Sciences, ul. Vavilova 28, Moscow 117823, Russia
Fax: (+7) 095 1355085; E-mail: [email protected]
Physics Department, Moscow State University, Moscow 117234, Russia
Department of Polymer Science, University of Ulm, Ulm D-089069, Germany
State Scientific Centre GosNIIGenetika, 1 Dorozhny proezd 1, Moscow, 113545, Russia
Similar resources
Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM
Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence’s fold and functionality. The biggest class of repetitive subsequences is “Transposable Elements”, which has its own sub-classes based on the contexts’ structures. Much research has been performed to critically determine the structure and function of repetitiv...
Analysis of symbolic sequences using the Jensen-Shannon divergence.
We study statistical properties of the Jensen-Shannon divergence D, which quantifies the difference between probability distributions, and which has been widely applied to analyses of symbolic sequences. We present three interpretations of D in the framework of statistical physics, information theory, and mathematical statistics, and obtain approximations of the mean, the variance, and the prob...
Finding Exact and Solo LTR-Retrotransposons in Biological Sequences Using SVM
Finding repetitive subsequences in a genome is a challenging problem in bioinformatics research. Many approaches have been proposed to solve the problem, which can be divided into library-based and de novo methods. Library-based methods use predetermined repetitive genome subsequences, whereas library-less methods attempt to discover repetitive subsequences by analytical approach...
The effect of sequence on the conformational stability of a model heteropolymer in explicit water.
We investigate the properties of a two-dimensional lattice heteropolymer model for a protein in which water is explicitly represented. The model protein distinguishes between hydrophobic and polar monomers through the effect of the hydrophobic monomers on the entropy and enthalpy of the hydrogen bonding of solvation shell water molecules. As experimentally observed, model heteropolymer sequence...
Density-based Clustering of Time Series Subsequences
Doubts have been raised that time series subsequences can be clustered in a meaningful way. This paper introduces a kernel-density-based algorithm that detects meaningful patterns in the presence of a vast number of random-walk-like subsequences. The value of density-based algorithms for noise elimination in general has long been demonstrated. The challenge of applying such techniques to time-s...